Mike Zheng Shou

ShowUI-Aloha: Human-Taught GUI Agent

Jan 12, 2026

FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection

Jan 07, 2026

ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands

Dec 31, 2025

Factorized Learning for Temporally Grounded Video-Language Models

Dec 30, 2025

Mitty: Diffusion-based Human-to-Robot Video Generation

Dec 19, 2025

EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models

Dec 16, 2025

OmniPSD: Layered PSD Generation with Diffusion Transformer

Dec 10, 2025

H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos

Dec 10, 2025

Computer-Use Agents as Judges for Generative User Interface

Nov 19, 2025

AUTO-Explorer: Automated Data Collection for GUI Agent

Nov 09, 2025